clear
set more off
macro drop _all
capture log close

/********************************************************************************
Discrimination in Multi-Phase Systems: Evidence from Child Protection
Clean the Allegations Match Data

Created on: 11/30/17

Last Modified on: 2/20/2024

Description: This file cleans the raw allegations match data.

Note that we have removed the file directory names from this program for 
confidentiality reasons.
********************************************************************************/

** Setting directories (paths removed; each must end in "/" because file names are appended directly)
global rawdata 
global cleandata 
global tmp 

/********************************************************************************

There are 2 sections of this do file:
1) Clean Variables
2) Collapse to Child*Case level

*******************************************************************************/

*************************
**1) CLEAN VARIABLES
*************************

**Load raw allegations match data
use "${rawdata}alleg_match.dta", clear

rename intakechildvicpartyid vicid
la var vicid "Child Victim ID"

la var intake_id "Intake Case ID"

gen intake_dt=date(intake_date,"YMD")
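**Optional check (an addition, not part of the original program): count intake
**dates that failed to parse, i.e. a nonempty string that date() could not
**convert. A nonzero count would suggest some values are not in YMD order.
count if missing(intake_dt) & !missing(intake_date)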
drop intake_date
rename intake_dt intake_date
format intake_date %td
la var intake_date "Intake Date"

gen removal_date=date(rmvl_dt,"YMD")
drop rmvl_dt
format removal_date %td
la var removal_date "Child Removal Date"

**Drop the preponderance flag (already in the allegations file) and the time variable
drop prep time 

duplicates report vicid intake_id removal_date

/*
--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |        77728             0
--------------------------------------
*/
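
**Optional safeguard (an addition, not part of the original program): isid stops
**the do-file with an error if this key ever stops being unique, instead of
**relying on the recorded output above. The missok option is assumed to be needed
**because removal_date is presumably missing for children who were never removed.
isid vicid intake_id removal_date, missok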

compress
sort vicid intake_id removal_date
save "${cleandata}alleg_match_clean.dta", replace

*************************
**2) COLLAPSE TO CHILD BY CASE LEVEL
*************************

**Load clean allegations match data
use "${cleandata}alleg_match_clean.dta", clear

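**Create var for the first (earliest) removal date associated with each case,
**then replace the row-level removal date with it so that the variable kept
**below actually holds the first removal date
bysort vicid intake_id: egen removal_date1=min(removal_date)
drop removal_date
rename removal_date1 removal_date
format removal_date %td

la var removal_date "First Removal Date Associated with Investigation"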

egen tag=tag(vicid intake_id)
keep if tag==1
drop tag

keep vicid intake_id removal_date
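
**Optional check (an addition, not part of the original program): confirm the
**data now have exactly one row per child-case pair. This assumes vicid and
**intake_id are never missing; if they can be, add the missok option.
isid vicid intake_id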

sort vicid intake_id
compress
save "${cleandata}alleg_match_child_case_level.dta", replace





